Abstract. Define the weight of a matrix to be the number of non-zero entries. One would like to 
count m by n matrices over a finite field by their weight and rank. This is equivalent to determining 
the probability distribution of the weight while conditioning on the rank. A complete answer to 
this question is still far off. As a step in that direction, this paper finds a closed form for the 
average weight of an m by n matrix of rank k over the finite field with q elements. The formula is a 
simple algebraic expression in m, n, k, and q. For rank one matrices a complete description of the 
weight distribution is given and a central limit theorem is proved. 
1. Introduction. For an $m \times n$ matrix $A$ over the finite field $\mathbb{F}_q$, the weight
of $A$, denoted $\operatorname{wt} A$, is the number of non-zero entries. In the Hamming metric on
matrices it is the distance from $A$ to $0$.

There is some relationship between the rank and the weight of a matrix. For
example, if $\operatorname{wt} A = 1$, then $\operatorname{rk} A = 1$, and if $\operatorname{rk} A = k$, then $\operatorname{wt} A \ge k$. On the other
hand, there are matrices of rank one and maximal weight $mn$, such as a matrix with
every entry a one. In this article we determine the average weight of rank $k$ matrices
in terms of $k$, $m$, $n$, and $q$. Without fixing the rank, the average weight of $m \times n$
matrices is $mn(1 - 1/q)$ and the weight has a binomial distribution. However, the full
probability distribution of the weight for matrices of rank $k$ is yet to be determined.

The tools are those of elementary combinatorics and linear algebra. Nothing 
special is used from the theory of finite fields other than the understanding that the 
fundamental ideas of linear algebra work over all fields and not just the real or complex 
numbers. 

We need a modest amount of background material. We use three basic formulas. 
Formula 1.1. The number of ordered $k$-tuples of linearly independent vectors in
$\mathbb{F}_q^n$ is

\[ (q^n - 1)(q^n - q)(q^n - q^2) \cdots (q^n - q^{k-1}). \]


Proof. The first vector is any non-zero vector and each succeeding vector must 
avoid the span of the previous vectors. □ 

Formula 1.2. The number of $k$-dimensional subspaces of $\mathbb{F}_q^n$ is given by the
q-binomial coefficient

\[ \begin{bmatrix} n \\ k \end{bmatrix}_q = \frac{(q^n - 1)(q^n - q)(q^n - q^2) \cdots (q^n - q^{k-1})}{(q^k - 1)(q^k - q)(q^k - q^2) \cdots (q^k - q^{k-1})}. \]

Proof. The numerator is the number of bases of all $k$-dimensional subspaces,
while the denominator is the number of bases of any given subspace. □
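Formula 1.3. The number of $m \times n$ matrices of rank $k$ is

\[
\begin{aligned}
& \begin{bmatrix} m \\ k \end{bmatrix}_q (q^n - 1)(q^n - q) \cdots (q^n - q^{k-1}) \\
={} & \begin{bmatrix} n \\ k \end{bmatrix}_q (q^m - 1)(q^m - q) \cdots (q^m - q^{k-1}) \\
={} & \frac{(q^m - 1)(q^m - q) \cdots (q^m - q^{k-1}) \, (q^n - 1)(q^n - q) \cdots (q^n - q^{k-1})}{(q^k - 1)(q^k - q)(q^k - q^2) \cdots (q^k - q^{k-1})}.
\end{aligned}
\]

Proof. For a fixed $k$-dimensional subspace $W \subset \mathbb{F}_q^m$, the number of matrices
with $W$ as the column space is equal to the number of $k \times n$ matrices of rank $k$. Such
a matrix is given by its $k$ linearly independent row vectors of length $n$. The number
of those is given by Formula 1.1. The number of $k$-dimensional subspaces of $\mathbb{F}_q^m$ is
$\begin{bmatrix} m \\ k \end{bmatrix}_q$ and the product is the number of rank $k$ matrices given in the first line. By the
same reasoning, the second line counts the number of $n \times m$ matrices of rank $k$, which
is the same. The third line follows from either of the first two by Formula 1.2. □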

A special case of Formula 1.3 is worth noting. The number of invertible $n \times n$
matrices is

\[ (q^n - 1)(q^n - q) \cdots (q^n - q^{n-1}). \]
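The three counting formulas can be checked by brute force for small cases. The following is a minimal sketch, not part of the original paper; it assumes $q = p$ is prime so that arithmetic modulo $p$ models the field, and the function names are chosen here for illustration only.

```python
from itertools import product

def independent_tuples(n, k, q):
    """Formula 1.1: ordered k-tuples of linearly independent vectors in F_q^n."""
    result = 1
    for i in range(k):
        result *= q**n - q**i
    return result

def q_binomial(n, k, q):
    """Formula 1.2: number of k-dimensional subspaces of F_q^n."""
    return independent_tuples(n, k, q) // independent_tuples(k, k, q)

def count_rank_k(m, n, k, q):
    """Formula 1.3 (first line): number of m x n matrices of rank k over F_q."""
    return q_binomial(m, k, q) * independent_tuples(n, k, q)

def rank_mod_p(rows, p):
    """Rank over F_p (p prime, an assumption of this sketch) by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below position `rank` with a non-zero entry in this column.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse, Python 3.8+
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def brute_force_rank_counts(m, n, p):
    """Enumerate all m x n matrices over F_p and tally them by rank."""
    counts = [0] * (min(m, n) + 1)
    for entries in product(range(p), repeat=m * n):
        rows = [entries[i * n:(i + 1) * n] for i in range(m)]
        counts[rank_mod_p(rows, p)] += 1
    return counts

print(brute_force_rank_counts(2, 3, 2))                 # [1, 21, 42]
print([count_rank_k(2, 3, k, 2) for k in range(3)])     # [1, 21, 42]
```

The enumeration grows as $q^{mn}$, so this check is only practical for very small $m$, $n$, and $q$.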

2. Average Weight. The average weight of a rank $k$ matrix is the sum of the
average weights of the entries, and the average weight of the $ij$ entry is the probability
that the entry is not zero:

\[ E(\operatorname{wt} A) = \sum_{i,j} P(a_{ij} \neq 0). \]


Theorem 2.1. The probability that $a_{ij} \neq 0$ for a rank $k$ matrix $A$ is the same
for all $i$ and $j$.

Proof. The probability that the $ij$ entry is not zero is the quotient whose
numerator is the number of matrices $A$ of rank $k$ with $a_{ij} \neq 0$, and whose denominator
is the number of matrices of rank $k$. Consider the map on the $m \times n$ matrices that
switches rows 1 and $i$ and switches columns 1 and $j$. This map preserves rank and
gives a bijection between the subset of matrices of rank $k$ with a non-zero in the $1,1$
location and the subset of matrices of rank $k$ with a non-zero in the $i,j$ location.
Thus, $P(a_{ij} \neq 0) = P(a_{11} \neq 0)$. □

Call this common value the average weight per entry. Now we focus on the
upper left corner of the matrices of rank $k$. Our analysis depends on the reduced row
echelon form. We recall the definition.

Definition 2.2. A rectangular matrix is in row echelon form if it has the
following three properties:
1. All non-zero rows are above any rows of all zeros.
2. Each leading entry of a row is in a column to the right of the leading entry
   of the row above it.
3. All entries in a column below a leading entry are zero.
If a matrix in echelon form satisfies the following additional conditions, then it is in
reduced row echelon form:
4. The leading entry in each non-zero row is 1.
5. Each leading 1 is the only non-zero entry in its column.

The $k \times n$ matrices of rank $k$ in reduced row echelon form correspond bijectively with the
$k$-dimensional subspaces of $\mathbb{F}_q^n$. The rows of the matrix give a basis of the corresponding
subspace. When an $m \times n$ matrix $A$ of rank $k$ is reduced to reduced row echelon form by row
operations, the result is an $m \times n$ matrix whose first $k$ rows form a basis of the row
space of $A$. Let $R$ be the $k \times n$ matrix consisting of the $k$ non-zero rows of the reduced
form. Then $A$ and all matrices with the same row space can be constructed from $R$
by multiplying $R$ on the left by an $m \times k$ matrix $C$ of rank $k$. The matrix $C$ is unique.
This gives a factorization of $A$ as $A = CR$. In terms of the associated linear maps,
$A$ is a linear map from $\mathbb{F}_q^n$ to $\mathbb{F}_q^m$, which factors into a surjective map onto $\mathbb{F}_q^k$
followed by an injective map from $\mathbb{F}_q^k$ to $\mathbb{F}_q^m$. Recall that knowing the row space
of a matrix is equivalent to knowing the kernel of the associated linear map. Thus,
when the reduced matrix $R$ is held fixed and $C$ is varied, the product $CR$ gives all
maps with the same row space (i.e. same kernel).
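The factorization $A = CR$ can be computed directly: $R$ is the set of non-zero rows of the reduced row echelon form, and because the pivot columns of $R$ form an identity matrix, $C$ consists of the pivot columns of $A$. The following is a minimal sketch under the assumption that $q = p$ is prime and $A$ is non-zero; the helper names are hypothetical and not taken from the paper.

```python
def rref_mod_p(rows, p):
    """Return (R, pivots): the non-zero rows of the reduced row echelon form
    over F_p (p prime, an assumption of this sketch) and the pivot columns."""
    rows = [[x % p for x in r] for r in rows]
    pivots = []
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        inv = pow(rows[r][col], -1, p)  # scale so the leading entry is 1
        rows[r] = [(x * inv) % p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows[:r], pivots

def factor_cr(A, p):
    """Factor a non-zero matrix A over F_p as A = C R with R in reduced form."""
    R, pivots = rref_mod_p(A, p)
    # The pivot columns of R form an identity matrix, so C is read off from A.
    C = [[row[j] % p for j in pivots] for row in A]
    return C, R

def matmul_mod_p(C, R, p):
    """Matrix product over F_p for lists of row lists."""
    return [[sum(c * r for c, r in zip(row, col)) % p for col in zip(*R)]
            for row in C]

A = [[1, 2, 0], [2, 4, 1], [0, 0, 1]]   # a rank 2 matrix over F_5
C, R = factor_cr(A, 5)
print(R)                                 # [[1, 2, 0], [0, 0, 1]]
print(C)                                 # [[1, 0], [2, 1], [0, 1]]
print(matmul_mod_p(C, R, 5) == A)        # True
```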

Theorem 2.3. For $m \times n$ matrices of rank $k$, the average weight per entry is

\[ \frac{(1 - 1/q)(1 - 1/q^k)}{(1 - 1/q^m)(1 - 1/q^n)}. \]


Proof. We consider the random selection of a rank $k$ matrix such that each such
matrix is equally probable. With the factorization $A = CR$, this can be done by
selecting $C$ uniformly from all $m \times k$ matrices of rank $k$ and by selecting $R$
independently from among all $k \times n$ reduced row echelon matrices of rank $k$, which is the same as selecting
the row space uniformly from all $k$-dimensional subspaces of $\mathbb{F}_q^n$. The upper left
corner of $A$ is $a_{11} = c_{11} r_{11} + c_{12} r_{21} + \cdots + c_{1k} r_{k1}$. But because $R$ is in reduced row
echelon form, the first column of $R$ is either all zeros or has a leading 1 followed by
zeros. Thus, $a_{11} = c_{11} r_{11}$. In order for $a_{11}$ to be non-zero, both $r_{11}$ and $c_{11}$ must be
non-zero. Since the selection of $R$ is independent of the selection of $C$,

\[ P(a_{11} \neq 0) = P(c_{11} \neq 0) \, P(r_{11} \neq 0). \]

The columns of $C$ are $k$ linearly independent vectors of length $m$ and so the first
column is not the zero vector. That means there are $q^m - 1$ possible first column
vectors, each equally likely. For $c_{11} \neq 0$ there are $(q - 1)$ choices for $c_{11}$ and $q^{m-1}$ choices for the remaining
entries of the first column. Therefore,

\[ P(c_{11} \neq 0) = \frac{(q - 1) q^{m-1}}{q^m - 1} = \frac{q^m - q^{m-1}}{q^m - 1}. \]



T. Migler, K.E. Morrison, M. Ogle 



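Now $r_{11} = 0$ or $1$, and $r_{11} = 0$ when the row space of $R$ contains nothing in the
direction of the vector $(1, 0, 0, \ldots, 0)$, which is to say that the row space is contained
in the $(n - 1)$-dimensional space $\{(0, x_2, \ldots, x_n)\}$. Therefore,

\[ P(r_{11} = 0) = \frac{\begin{bmatrix} n-1 \\ k \end{bmatrix}_q}{\begin{bmatrix} n \\ k \end{bmatrix}_q}, \qquad P(r_{11} \neq 0) = 1 - \frac{\begin{bmatrix} n-1 \\ k \end{bmatrix}_q}{\begin{bmatrix} n \\ k \end{bmatrix}_q}. \]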

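Using Formula 1.2 one easily obtains

\[ P(r_{11} \neq 0) = 1 - \frac{q^{n-k} - 1}{q^n - 1} = \frac{q^n - q^{n-k}}{q^n - 1}. \]

Putting these results together we have

\[
P(a_{11} \neq 0) = \frac{q^m - q^{m-1}}{q^m - 1} \cdot \frac{q^n - q^{n-k}}{q^n - 1}
= \frac{(q^m - q^{m-1})(q^n - q^{n-k})}{(q^m - 1)(q^n - 1)}
= \frac{(1 - 1/q)(1 - 1/q^k)}{(1 - 1/q^m)(1 - 1/q^n)}. \quad □
\]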



With this result we have a clear picture of the effect of the parameters $k$, $m$, and
$n$ on the average weight. The factor $1 - 1/q$ is the average weight per entry without
the rank condition, in which case the matrix size does not matter. Note that with
$m$ and $n$ fixed, it is more likely for an entry to be non-zero for matrices of larger
rank, an intuitively plausible result because both weight and rank are some measure
of distance from the zero matrix. Also, one can see that as $m$, $n$, and $k$ simultaneously
go to infinity, the probability approaches $1 - 1/q$, which is again the unconditioned
probability. For invertible matrices of size $n$ (i.e. $k = m = n$) the probability of a
non-zero entry is

\[ \frac{1 - 1/q}{1 - 1/q^n}. \]
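The closed form for the average weight per entry can be compared with an exhaustive count in small cases. The sketch below is an illustration added here, not the authors' code; it assumes $q = p$ is prime, uses exact rational arithmetic, and its function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def rank_mod_p(rows, p):
    """Rank over F_p (p prime, an assumption of this sketch) by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def average_weight_per_entry(m, n, k, q):
    """Closed form for the probability that a given entry is non-zero."""
    num = (1 - Fraction(1, q)) * (1 - Fraction(1, q**k))
    den = (1 - Fraction(1, q**m)) * (1 - Fraction(1, q**n))
    return num / den

def brute_force_average_weight(m, n, k, p):
    """Average weight over all m x n matrices of rank k over F_p, by enumeration."""
    total_weight = 0
    count = 0
    for entries in product(range(p), repeat=m * n):
        rows = [entries[i * n:(i + 1) * n] for i in range(m)]
        if rank_mod_p(rows, p) == k:
            count += 1
            total_weight += sum(1 for x in entries if x)
    return Fraction(total_weight, count)

# 2 x 3 matrices of rank 1 over F_2: both values are 16/7.
print(brute_force_average_weight(2, 3, 1, 2))
print(2 * 3 * average_weight_per_entry(2, 3, 1, 2))
```

Multiplying the per-entry probability by $mn$ gives the average weight, which is what the enumeration measures.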



3. Weight of Rank One Matrices. For the matrices of rank one a more
complete analysis of the weight distribution is possible. In this case $C$ is a non-zero
column vector of length $m$ and $R$ is a non-zero row vector of length $n$ whose leading
non-zero entry is 1. The rank one matrix $A = CR$ is given by $a_{ij} = c_i r_j$, and so the
weight of $A$ is the product of the weights of $C$ and $R$. The weight of $C$ has a binomial
distribution conditioned on the weight being positive (the entries of $C$ cannot all be
zero):

\[ P(\operatorname{wt} C = \mu) = \binom{m}{\mu} \frac{(q - 1)^{\mu}}{q^m - 1}, \qquad 1 \le \mu \le m. \]

Likewise for $R$ the weight distribution is given by

\[ P(\operatorname{wt} R = \nu) = \binom{n}{\nu} \frac{(q - 1)^{\nu}}{q^n - 1}, \qquad 1 \le \nu \le n. \]



(To select a random R, choose a random non-zero vector of length n and then scale 
it to make the leading non-zero entry 1. The scaling does not change the weight.) 

The weight on rank one matrices is the product of these two binomial random 
variables, each conditioned to be positive. 

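\[
P(\operatorname{wt} A = \omega) = \sum_{\mu\nu = \omega} P(\operatorname{wt} C = \mu) \, P(\operatorname{wt} R = \nu)
= \sum_{\mu\nu = \omega} \binom{m}{\mu} \binom{n}{\nu} \frac{(q - 1)^{\mu + \nu}}{(q^m - 1)(q^n - 1)}. \tag{3.1}
\]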
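The sum over factorizations $\mu\nu = \omega$ is easy to tabulate exactly. The following is a minimal sketch, added here for illustration, that builds the distribution from the two conditioned binomial distributions; the function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def conditioned_binomial(size, q):
    """P(wt = w), w = 1..size, for a uniformly random non-zero vector
    of the given length over the field with q elements."""
    return {w: Fraction(comb(size, w) * (q - 1)**w, q**size - 1)
            for w in range(1, size + 1)}

def rank_one_weight_distribution(m, n, q):
    """Exact weight distribution of m x n rank one matrices, as a dict
    mapping each attainable weight to its probability."""
    dist = {}
    for mu, p_mu in conditioned_binomial(m, q).items():
        for nu, p_nu in conditioned_binomial(n, q).items():
            dist[mu * nu] = dist.get(mu * nu, 0) + p_mu * p_nu
    return dist

dist = rank_one_weight_distribution(2, 3, 2)
print(sorted(dist))                          # [1, 2, 3, 4, 6]
print(sum(dist.values()))                    # 1
print(sum(w * p for w, p in dist.items()))   # 16/7
```

In this $2 \times 3$ example over the field with two elements, the weight 5 does not occur because it is not a product $\mu\nu$ with $\mu \le 2$ and $\nu \le 3$.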



Not all weights between 1 and mn occur for rank 1 matrices since the weight is a 
product with one factor no greater than m and the other factor no greater than n. 
Plots of actual probability distributions show spikes and gaps. Plots of cumulative 
distributions are smoother and lead us to expect a limiting normal distribution. See
Figures 3.1 and 3.2.

Theorem 3.1. As m or n goes to infinity, the weight distribution of rank one 
matrices approaches a normal distribution. 

Proof. The weight random variable for rank one matrices of size $m \times n$ is the
product of independent binomial random variables conditioned on being positive.

Define $W = XY$, where $X = \sum_{1 \le i \le m} X_i$, $Y = \sum_{1 \le j \le n} Y_j$, and the $X_i$ and $Y_j$ are
independent Bernoulli random variables with probability $1/q$ of being 0. Then $W$ is
the sum of the $m$ identically distributed random variables $X_i Y$, which are independent
once the value of $Y$ is fixed. The weight of rank one matrices has the distribution of $W$
conditioned on $W > 0$. By the Central Limit Theorem the
distribution of $W$ converges, as $m \to \infty$, to a normal distribution after suitable
scaling. Conditioning on $W$ being positive does not change this result: $W > 0$ exactly
when $X > 0$ and $Y > 0$, and the probability that $X > 0$ is $1 - q^{-m}$, which goes to 1 as $m \to \infty$. □

Now to compute the mean and variance of the weight, let $W = XY$ as before but
without conditioning on $X$ or $Y$ being positive. Then $E(W) = mn(1 - 1/q)^2$ and

\[ E(W^2) = E\left( \Bigl( \sum_{i} X_i \Bigr)^2 \Bigl( \sum_{j} Y_j \Bigr)^2 \right). \]

Expanding and using the independence of the random variables $X_i, Y_j$ and the fact
that $X_i^2 = X_i$ and $Y_j^2 = Y_j$, we get

\[
E(W^2) = mn \left(1 - \frac{1}{q}\right)^2 + mn(m + n - 2) \left(1 - \frac{1}{q}\right)^3
+ m(m - 1)n(n - 1) \left(1 - \frac{1}{q}\right)^4.
\]

Fig. 3.1. Distribution for the weight of rank 1 matrices, m = n = 25, q = 2.

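The variance of the weight is

\[
\operatorname{var}(W \mid W > 0) = E(W^2 \mid W > 0) - E(W \mid W > 0)^2
= \frac{E(W^2)}{P(W > 0)} - \left( \frac{E(W)}{P(W > 0)} \right)^2.
\]

Furthermore

\[ P(W > 0) = P(X > 0) \, P(Y > 0) = \left(1 - \frac{1}{q^m}\right) \left(1 - \frac{1}{q^n}\right). \]

Combining these expressions we get

\[
\operatorname{var}(W \mid W > 0) =
\frac{mn \left(1 - \frac{1}{q}\right)^2 + mn(m + n - 2) \left(1 - \frac{1}{q}\right)^3 + m(m - 1)n(n - 1) \left(1 - \frac{1}{q}\right)^4}{\left(1 - \frac{1}{q^m}\right) \left(1 - \frac{1}{q^n}\right)}
- \left( \frac{mn \left(1 - \frac{1}{q}\right)^2}{\left(1 - \frac{1}{q^m}\right) \left(1 - \frac{1}{q^n}\right)} \right)^2.
\]

Fig. 3.2. Cumulative frequency distribution for the weight of rank 1 matrices, m = n = 25,
q = 2. The smooth curve is the normal cdf with the same mean (156.25) and standard deviation
(approximately 44.63).
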

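We get a good approximation to the variance of the weight when $m$ and $n$ are large by using the
unconditioned $W$, essentially replacing the factors in the denominator by 1. Thus,

\[
\begin{aligned}
\operatorname{var}(W) &= E(W^2) - E(W)^2 \\
&= mn \left(1 - \frac{1}{q}\right)^2 + mn(m + n - 2) \left(1 - \frac{1}{q}\right)^3
+ m(m - 1)n(n - 1) \left(1 - \frac{1}{q}\right)^4 - \left( mn \left(1 - \frac{1}{q}\right)^2 \right)^2,
\end{aligned}
\]
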

which can be simplified to give

\[
\operatorname{var}(W) = mn(1 - n - m) \left(1 - \frac{1}{q}\right)^4
+ mn(n + m - 2) \left(1 - \frac{1}{q}\right)^3
+ mn \left(1 - \frac{1}{q}\right)^2.
\]

From this we can see, for example, that for $m \sim n$, $m, n \to \infty$, the leading terms
combine to $mn(m + n)(1 - 1/q)^3 / q$, so the variance is on
the order of $n^3$ and the standard deviation is of order $n^{3/2}$.
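As a numerical check of the mean and variance formulas, the values quoted in the caption of Fig. 3.2 (m = n = 25, q = 2) can be recomputed. This is a small sketch added for illustration, using floating point arithmetic and a hypothetical function name.

```python
from math import sqrt

def rank_one_weight_moments(m, n, q):
    """Return (conditioned mean, conditioned variance, unconditioned variance)
    of the weight W = XY for m x n rank one matrices over F_q."""
    p = 1 - 1 / q                              # probability an entry of C or R is non-zero
    positive = (1 - q**-m) * (1 - q**-n)       # P(W > 0)
    ew = m * n * p**2                          # E(W)
    ew2 = (m * n * p**2
           + m * n * (m + n - 2) * p**3
           + m * (m - 1) * n * (n - 1) * p**4)  # E(W^2)
    mean_cond = ew / positive
    var_cond = ew2 / positive - mean_cond**2
    var_uncond = (m * n * (1 - n - m) * p**4
                  + m * n * (n + m - 2) * p**3
                  + m * n * p**2)              # simplified form of E(W^2) - E(W)^2
    return mean_cond, var_cond, var_uncond

mean_cond, var_cond, var_uncond = rank_one_weight_moments(25, 25, 2)
print(round(mean_cond, 2))        # 156.25
print(round(sqrt(var_cond), 2))   # 44.63
print(var_uncond)                 # 1992.1875
```

For these parameters the conditioned and unconditioned variances agree to several digits, which is consistent with replacing the denominator factors by 1.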